Skip to content

Local reply and upstream response rewrites using the FilterManager#14926

Closed
esmet wants to merge 10 commits intoenvoyproxy:mainfrom
esmet:response-map-filter-manager
Closed

Local reply and upstream response rewrites using the FilterManager#14926
esmet wants to merge 10 commits intoenvoyproxy:mainfrom
esmet:response-map-filter-manager

Conversation

@esmet
Copy link
Contributor

@esmet esmet commented Feb 3, 2021

Work in progress converting #14221 to behavior in the filter manager.

We'll probably want to tease apart the LocalReply case and the upstream rewrite case, perhaps by renaming some of the variables/objects. Feedback welcome.

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:

esmet added 2 commits February 3, 2021 17:15
Signed-off-by: John Esmet <john.esmet@gmail.com>
Signed-off-by: John Esmet <john.esmet@gmail.com>
// Encode the rewritten response right away if the original `modified_end_stream` was true.
// If it wasn't, then we know a body will be encoded later, and we'll let that function
// take care of encoding the rewritten response.
if (do_rewrite_ && original_modified_end_stream && buffered_response_data_->length() > 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the constant checking for buffered_response_data->length() > 0 would be improved if the local_reply_.rewrite interface provided a way of returning a null optional for the rewritten body, in the case where there is no configured body rewrite.

Signed-off-by: John Esmet <john.esmet@gmail.com>
@esmet esmet changed the title Response map filter manager Local reply rewrites using the FilterManager Feb 5, 2021
// Try to rewrite the response. This may end up not doing any actual rewrite if this is
// a local reply and there is no matching rule. If it's not a local reply, we know there's
// a match at this point since that's a condition for state_.do_response_rewrite_ above.
rewriteResponse();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in the code, we've already run through the filter chain.

Question: should we do response rewriting before or after the filter chain? If we do it before, then the filter chain has an opportunity to see the rewritten response and make its own changes. If we do it after, then we give response rewriting the final say and the highest precedence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, if we can tell ahead of time we're doing a rewrite, I think we want to let the filters see the reply that is going to the client. If they do their own stats tracking of say "how many 500s happened" I think they want to see what will be written.
@snowp @lizan can I get a non-googly take on this question?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One interesting case to consider is when a filter itself returns a status (or sets a header) that would match a rewrite rule. For example, if ext authz returned 403, the user may want the same 403 rewrite rule to apply as it would have if an upstream service had returned it. Same story might reply to a sentinel header like x-my-custom-failure-header that could get set by your upstream services or by a custom filter. Either way the user might want the same rewrite rules to match/apply.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a tricky one. I haven't yet thought about this a huge amount, but I think in a perfect world we would do something like the following:

  1. Unconditional rewriting happens before the encoder filter chain runs.
  2. Conditional rewriting (matching) actually is attempted after each encoder filter runs. This I think is the behavior we want where it allows each filter to change something and then possibly have a transform happen before the next filter runs.

I do wonder if somehow this could be built using the new matching infra that @snowp created but I haven't looked at it in detail yet.

Let me know if it would be easier to chat about this in a short meeting or in a doc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that matched my thought, where (I think) this one qualifies as unconditional and would happen first.

return;
}

buffered_response_data_ =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to use buffered_response_data_ since the filter chain uses this during addEncodedData / modifyEncodingBuffer, and it seemed appropriate. Is this reasonable?

esmet added 2 commits February 8, 2021 16:59
Signed-off-by: John Esmet <john.esmet@gmail.com>
Signed-off-by: John Esmet <john.esmet@gmail.com>
// member that stays nullptr when no local reply config is present to save space in
// the 'off' case. But, for now, we spend two pointers of space in all cases.
const LocalReply::LocalReply& local_reply_;
Utility::LocalReplyDataPtr local_reply_data_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a TODO since I think it will be a requirement to not regress memory usage in the case where Envoy is not configured to use any local reply rewrites.

// Tracks if headers other than 100-Continue have been encoded to the codec.
bool non_100_response_headers_encoded_ : 1;
// True if we should rewrite the upstream response using local_reply_
bool do_response_rewrite_ : 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably move this state bit into local_reply_data_ (perhaps renamed as local_reply_context_/state_ or whatever) but this approach worked quickly and IIUC there should already be a bit available in this struct without causing an extra byte to the struct space.

// Local reply rewrites (and upstream rewrites, for that matter) happen on the filter
// chain now, so we do not need to do any additional rewriting here.
/*rewrite_=*/nullptr,
// Local gRPC replies are encoded as trailers-only responses on the filter chain,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the above comment I should explicitly call out "but not upstream gRPC replies"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, would it work to have an ASSERT above this function where we check that if it's a gRPC reply that there's no body or trailers? Then we could move most of this comment out and say // As documented above, no special gRPC logic is necessary here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call

},
[&](ResponseHeaderMap& response_headers, Code& code, std::string& body,
absl::string_view& content_type) -> void {
/* Direct local reply rewrites happen here since we are not going through the filter
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should explicitly clarify "must happen here since..."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to check my understanding, local reply rewrites that currently exist don't go through the filter chain? Because local replies as a whole do AFIK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local replies in the general case do go through the filter chain. However, in this direct reply case (e.g. on downstream timeouts where headers have already been sent), we are not going through the filter chain so we continue to call local_reply_.rewrite here.

[&](ResponseHeaderMap& headers, Code& code, std::string& body,
const absl::optional<Grpc::Status::GrpcStatus> grpc_status,
bool is_head_request) -> void {
/* Direct local reply gRPC encoding happens here since we are not going through the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same re: "must"

// a response body.
const bool original_modified_end_stream = modified_end_stream;

// See if we should do an upstream rewrite. For local replies, always try to do a rewrite.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"should do a response rewrite"

@antoniovicente
Copy link
Contributor

@esmet Is there someone you want early feedback from as you continue working on this draft PR?

@esmet
Copy link
Contributor Author

esmet commented Feb 9, 2021

@esmet Is there someone you want early feedback from as you continue working on this draft PR?

It would be good to get some early feedback from the reviewers (and/or assignees) of #14221 though I've been talking to @alyssawilk offline a bit about it as well. I could also just pull it out of draft. There are new tests and the minimal thing seems to work. What do you think?

@esmet esmet changed the title Local reply rewrites using the FilterManager Local reply and upstream response rewrites using the FilterManager Feb 9, 2021
@antoniovicente
Copy link
Contributor

@esmet Is there someone you want early feedback from as you continue working on this draft PR?

It would be good to get some early feedback from the reviewers (and/or assignees) of #14221 though I've been talking to @alyssawilk offline a bit about it as well. I could also just pull it out of draft. There are new tests and the minimal thing seems to work. What do you think?

That answers my question. Keeping it as a draft is fine. Let me know if you want me to make a pass since I have some familiarity with HTTP processing paths.

Thanks for working on this important improvement to Envoy error handling.

@esmet
Copy link
Contributor Author

esmet commented Feb 9, 2021

@esmet Is there someone you want early feedback from as you continue working on this draft PR?

It would be good to get some early feedback from the reviewers (and/or assignees) of #14221 though I've been talking to @alyssawilk offline a bit about it as well. I could also just pull it out of draft. There are new tests and the minimal thing seems to work. What do you think?

That answers my question. Keeping it as a draft is fine. Let me know if you want me to make a pass since I have some familiarity with HTTP processing paths.

Thanks for working on this important improvement to Envoy error handling.

I would definitely appreciate your review. The quick context here is that we decided that the http extension implementation wouldn't work in all cases so I'm trying to move it into the FM. This is my first deep dive into HCM/FM so I chose patterns that seemed to work though they may not be the right way - feedback appreciated. Per-route config is still TODO.

Signed-off-by: John Esmet <john.esmet@gmail.com>
Signed-off-by: John Esmet <john.esmet@gmail.com>
Signed-off-by: John Esmet <john.esmet@gmail.com>
Signed-off-by: John Esmet <john.esmet@gmail.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I want to get general take on when we apply rewrites before I dig in too much. Tagging snow/lizan for that and then I'l do another pass :-)

// Local reply rewrites (and upstream rewrites, for that matter) happen on the filter
// chain now, so we do not need to do any additional rewriting here.
/*rewrite_=*/nullptr,
// Local gRPC replies are encoded as trailers-only responses on the filter chain,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, would it work to have an ASSERT above this function where we check that if it's a gRPC reply that there's no body or trailers? Then we could move most of this comment out and say // As documented above, no special gRPC logic is necessary here.

// A bit awkward, but save the local reply data for later. This will come in handy.
// When this pointer is non-null, it means a local reply has been initiated, which
// is important later on.
local_reply_data_ = std::make_unique<Utility::LocalReplyData>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there's got to be a way to avoid this. If I'm wrong (and I probably am given you've probably tried =P) let's rewrite to just explain why we do this
// latch the local reply data, as it us used later in blahblah function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best explanation is that we may need some of this context if we end up doing the rewrite in encodeData (because an upstream response exists and needs to be dropped / replaced by the rewritten body, so this cannot happen in encodeHeaders alone)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing again, I agree this is pretty dirty but it currently serves an important purpose. Mostly, to have the local reply code path communicate to the canonical rewrite path that the local reply is grpc, is a head request, the grpc status, local reply body etc. I think this would seem cleaner, at least on the surface, if there were one member local_reply_context_ that held both local reply data (if any) and a pointer to a local reply object that should be used for the request. This could also give us an opportunity to read per-route config and override that local reply object to the per-route variant. The alternative would be to create the FM with the per-route local reply rewriter, but I'm not sure yet which makes the most sense from a lifecycle perspective.

},
[&](ResponseHeaderMap& response_headers, Code& code, std::string& body,
absl::string_view& content_type) -> void {
/* Direct local reply rewrites happen here since we are not going through the filter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to check my understanding, local reply rewrites that currently exist don't go through the filter chain? Because local replies as a whole do AFIK

// Try to rewrite the response. This may end up not doing any actual rewrite if this is
// a local reply and there is no matching rule. If it's not a local reply, we know there's
// a match at this point since that's a condition for state_.do_response_rewrite_ above.
rewriteResponse();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, if we can tell ahead of time we're doing a rewrite, I think we want to let the filters see the reply that is going to the client. If they do their own stats tracking of say "how many 500s happened" I think they want to see what will be written.
@snowp @lizan can I get a non-googly take on this question?

Signed-off-by: John Esmet <john.esmet@gmail.com>
@alyssawilk
Copy link
Contributor

cc @goaway for your thoughts w.r.t. local reply error handling

@alyssawilk
Copy link
Contributor

yep, sorry, meant to tag @mattklein123 and @snowp on this one for if filtering the response should happen before or after.

@esmet
Copy link
Contributor Author

esmet commented Mar 22, 2021

@alyssawilk @snowp circling back to this, do we think the HCM response mapping proposed in this PR is a good path forward? To Alyssa's point, when should we perform the rewrite, before or after the filter chain?

@snowp
Copy link
Contributor

snowp commented Mar 22, 2021

I haven't reviewed the PR but I think it should probably happen before the filter chain, otherwise the filters have no way of observing the final downstream response. If we are trying to ensure that the response has a specific format after the filters are done it seems to me like it should be implemented as a filter that sits at the front of the filter chain, so we'd then make use of the conditional per filter mutation as mentioned elsewhere in this thread.

@esmet
Copy link
Contributor Author

esmet commented Mar 22, 2021

I haven't reviewed the PR but I think it should probably happen before the filter chain, otherwise the filters have no way of observing the final downstream response. If we are trying to ensure that the response has a specific format after the filters are done it seems to me like it should be implemented as a filter that sits at the front of the filter chain, so we'd then make use of the conditional per filter mutation as mentioned elsewhere in this thread.

Implementing this as a filter is possible in some sense (implemented in my first PR here #14221) but there were concerns about that not being the right approach, so I rewrote it in the HCM for this PR.

@snowp
Copy link
Contributor

snowp commented Mar 22, 2021

@esmet Sorry I haven't been following this issue, but I was thinking that we'd have the simpler response mapping happen before filters (like local reply mapping happens today) and then encourage the use of a more sophisticated filter/matcher to manage transformations throughout the filter chain (similar to what @mattklein123 suggests here #14926 (comment), just without the per filter HCM integration).

Overall I think it makes sense to be able to map upstream responses, but it seems the most sane for me to have it happen before the filters see the response since we do the same for local replies.

@goaway
Copy link
Contributor

goaway commented Mar 22, 2021

I've been keeping an eye on this issue, and while I don't have any strong opinions regarding placement of the implementation, I do feel it's worth mentioning that from a client perspective, we really do want filters to be seeing what actually will be sent.

OTOH, I don't know in what scenario this would actually be used in a client build, but I still think ideally we take an approach (as others have suggested) that maintains this visibility.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Apr 21, 2021
@github-actions
Copy link

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot closed this Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale stalebot believes this issue/PR has not been touched recently waiting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants